WHAT IS CLAIMED IS: 




1. A system for speech processing, comprising:
speech data generated from one or more speech sources;
an enhanced phone set; and
a transcription generated by a transcription process that selects
appropriate phones from said enhanced phone set to represent
said speech data.



2. The system of claim 1, further comprising a phone dataset that
includes said speech data and said transcription. 



3. The system of claim 2, wherein said phone dataset is utilized in 
training a speech recognizer. 


4. The system of claim 2, wherein said phone dataset is utilized in 
building a phonetic dictionary. 
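
As an illustration of claims 2 through 4, the following minimal Python sketch pairs speech data with its transcription in a phone dataset and derives a phonetic dictionary from it. The names `Utterance`, `PhoneDataset`, and `build_phonetic_dictionary`, the word-level alignment, and the sample phone strings are assumptions made for illustration and are not recited in the claims.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import List


@dataclass
class Utterance:
    """One item of speech data with its transcription (hypothetical layout)."""
    audio_id: str             # reference to the recorded speech data
    words: List[str]          # orthographic words, assumed to be available
    phones: List[List[str]]   # one transcribed phone sequence per word


@dataclass
class PhoneDataset:
    """Speech data together with its transcription, in the sense of claim 2."""
    utterances: List[Utterance] = field(default_factory=list)


def build_phonetic_dictionary(dataset):
    """Collect every pronunciation observed for each word in the dataset."""
    dictionary = defaultdict(set)
    for utt in dataset.utterances:
        for word, phones in zip(utt.words, utt.phones):
            dictionary[word].add(tuple(phones))
    return dict(dictionary)


dataset = PhoneDataset([
    Utterance("utt1", ["the", "cat"], [["dh", "ax"], ["k", "ae", "t"]]),
    Utterance("utt2", ["the"], [["dh", "iy"]]),
])
phonetic_dictionary = build_phonetic_dictionary(dataset)
```

Storing each pronunciation as a tuple in a set keeps pronunciation variants of the same word distinct without duplicating identical ones.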

5. The system of claim 2, further comprising transformation rules applied
to said phone dataset to produce a transformed phone dataset, said
transformed phone dataset being for use in training a speech recognizer.


6. The system of claim 2, further comprising transformation rules applied
to said phone dataset to produce a transformed phone dataset, said
transformed phone dataset being for use in building a phonetic dictionary.

7. The system of claim 1, where said enhanced phone set includes a 
TIMIT base-phone set and an extended base-phone set. 



8. The system of claim 7, wherein said extended base-phone set includes
base-phones for representing one of a glottal stop variation, a multiple burst
release, a fricative consonant closure, a vowel velarization, a vowel
lateralization, an R-coloring, a glide loss, an R-deletion, a labio-velar fricative,
and an articulator noise.
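
One way to picture the enhanced phone set of claims 7 and 8 is as the union of two symbol inventories. The Python sketch below is illustrative only: it lists a small subset of the TIMIT phone symbols, and the extended base-phone symbols (`x_glottal`, `x_rcolor`, `x_noise`) are invented placeholders, since the claims name the phenomena but not the symbols. The helper `unknown_phones` is likewise hypothetical.

```python
# A few symbols from the TIMIT phone inventory (subset only, for brevity).
TIMIT_BASE_PHONES = {"aa", "ae", "ah", "b", "d", "dh", "iy", "k", "q", "s", "t"}

# Hypothetical symbols for extended base-phones; these labels are invented
# for this sketch and mapped to phenomena listed in claim 8.
EXTENDED_BASE_PHONES = {
    "x_glottal": "glottal stop variation",
    "x_rcolor": "R-coloring",
    "x_noise": "articulator noise",
}

# The enhanced phone set contains both inventories.
ENHANCED_PHONE_SET = TIMIT_BASE_PHONES | set(EXTENDED_BASE_PHONES)


def unknown_phones(transcription):
    """Return the phones of a transcription that are not in the enhanced set."""
    return [p for p in transcription if p not in ENHANCED_PHONE_SET]


print(unknown_phones(["k", "ae", "x_glottal", "zz"]))  # prints ['zz']
```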

9. The system of claim 7, wherein said enhanced phone set includes 
acoustic-phonetic symbols, said acoustic-phonetic symbols being utilized in 
said transcription process to represent acoustic-phonetic processes of said
speech data.

10. The system of claim 9, wherein said enhanced phone set further
includes connectors used in said transcription process to connect said
acoustic-phonetic symbols to base-phones affected by acoustic-phonetic
processes, thereby producing composite-phones.

11. The system of claim 10, wherein said connectors indicate how and
where said acoustic-phonetic processes affect said base-phones. 

12. The system of claim 11, wherein said connectors include a character
">" that is placed to the left of one of said base-phones to indicate that one of 
said acoustic-phonetic processes affects a beginning of said one of said base- 
phones. 



13. The system of claim 12, wherein said character ">" is placed to the left
of one of said composite-phones to indicate that one of said acoustic-phonetic 
processes affects a beginning of said one of said composite-phones. 
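
The ">" connector of claims 12 and 13 can be sketched as a small parser. The sketch assumes, purely for illustration, that a composite-phone is written as an acoustic-phonetic symbol, then ">", then the affected phone. The function name `parse_composite` and the symbol `nas` are hypothetical and do not come from the claims.

```python
def parse_composite(token):
    """Split 'symbol>phone' into its parts; a bare base-phone has no symbol.

    Assumed convention: the acoustic-phonetic symbol stands before '>' and
    the affected phone after it. The affected phone may itself be a
    composite-phone (claim 13), so only the first '>' is split off.
    """
    if ">" in token:
        symbol, base = token.split(">", 1)
        return {"base": base, "symbol": symbol, "affects": "beginning"}
    return {"base": token, "symbol": None, "affects": None}


parsed = [parse_composite(t) for t in ["k", "nas>ae", "t"]]
```

Here `parsed[1]` records that the hypothetical process `nas` affects the beginning of the base-phone `ae`, while `k` and `t` are returned as plain base-phones.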

14. The system of claim 11, wherein said connectors include a character
that is placed to the right of one of said base-phones to indicate that one
of said acoustic-phonetic processes affects an ending of said one of said base-
phones.

15. The system of claim 14, wherein said character is placed to the
right of one of said composite-phones to indicate that one of said acoustic- 
phonetic processes affects an ending of said one of said composite-phones. 


16. The system of claim 11, wherein said connectors include a character
that is placed to the right of one of said base-phones to indicate that one
of said acoustic-phonetic processes affects an entirety of said one of said
base-phones.


17. The system of claim 16, wherein said character is placed to the 
right of one of said composite-phones to indicate that one of said acoustic- 
phonetic processes affects an entirety of said one of said composite-phones. 

18. The system of claim 11, wherein said connectors include a character
" A " that is placed to the right of one of said base-phones to indicate that one 
of said acoustic-phonetic processes occurred completely within said one of 
said base-phones. 

19. The system of claim 18, wherein said character " A " is placed to the
right of one of said composite-phones to indicate that one of said acoustic- 
phonetic processes occurred completely within said one of said composite- 
phones. 



20. The system of claim 9, wherein said acoustic-phonetic content
represented by said acoustic-phonetic symbols includes one of a nasalization,
a glottalization variance, a breathiness, a labialization, a palatalization, a
voicing, a devoicing, a voiced frication, a low frequency voiceless frication, a
high frequency voiceless frication, an epenthetic vowel, a murmur, an air
puff, a burst quality, an approximation, an absence of a burst/release, and a
tongue click.

21. The system of claim 5, wherein said transformation rules include
merge-type transformation rules that combine two adjacent phones in said 
phone dataset into a single phone selected from said enhanced phone set. 

22. The system of claim 5, wherein said transformation rules include split-
type transformation rules that separate one phone in said phone dataset into 
two different phones selected from said enhanced phone set. 

23. The system of claim 5, wherein said transformation rules include
replace-type transformation rules that replace one phone in said phone
dataset with a different phone selected from said enhanced phone set.

24. The system of claim 5, wherein said transformation rules include
change in context-type transformation rules that change one phone in said
phone dataset to a different phone selected from said enhanced phone set
depending on context.
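
The four rule types of claims 21 through 24 can be illustrated on a flat list of phones. The following Python sketch is one possible reading under stated assumptions, not the claimed implementation: the function names, the choice of "followed by a given phone" as the context, and the example phones are all assumptions, and the sketch does not check that results belong to the enhanced phone set.

```python
def merge(phones, pair, merged):
    """Merge-type rule: combine two adjacent phones into a single phone."""
    out, i = [], 0
    while i < len(phones):
        if tuple(phones[i:i + 2]) == pair:
            out.append(merged)
            i += 2          # skip both phones of the matched pair
        else:
            out.append(phones[i])
            i += 1
    return out


def split(phones, target, parts):
    """Split-type rule: separate one phone into two different phones."""
    out = []
    for p in phones:
        out.extend(parts if p == target else [p])
    return out


def replace(phones, target, new):
    """Replace-type rule: substitute one phone with a different phone."""
    return [new if p == target else p for p in phones]


def replace_in_context(phones, target, new, next_phone):
    """Context-type rule: change a phone only when next_phone follows it."""
    return [
        new
        if p == target and i + 1 < len(phones) and phones[i + 1] == next_phone
        else p
        for i, p in enumerate(phones)
    ]


phones = ["b", "ah", "tcl", "t", "er"]
merged = merge(phones, ("tcl", "t"), "t")
flapped = replace_in_context(merged, "t", "dx", "er")
```

Each function returns a new list, so rules can be chained in any order to produce a transformed phone dataset while the original transcription stays unchanged.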



25. A method for speech processing, comprising the steps of:
generating speech data from one or more speech sources;
providing an enhanced phone set; and
producing a transcription using a transcription process that selects
appropriate phones from said enhanced phone set to represent
said speech data.

26. The method of claim 25, further comprising the step of combining said
speech data and said transcription to produce a phone dataset.

27. The method of claim 26, wherein said phone dataset is utilized in
training a speech recognizer.

28. The method of claim 26, wherein said phone dataset is utilized in 
building a phonetic dictionary. 

29. The method of claim 26, further comprising the step of applying
transformation rules to said phone dataset to produce a transformed phone
dataset, said transformed phone dataset being for use in training a speech
recognizer.



30. The method of claim 26, further comprising the step of applying
transformation rules to said phone dataset to produce a transformed phone
dataset, said transformed phone dataset being for use in building a phonetic
dictionary.



31. The method of claim 25, where said enhanced phone set includes a 
TIMIT base-phone set and an extended base-phone set. 

32. The method of claim 31, wherein said extended base-phone set
includes base-phones for representing one of a glottal stop variation, a
multiple burst release, a fricative consonant closure, a vowel velarization, a
vowel lateralization, an R-coloring, a glide loss, an R-deletion, a labio-velar
fricative, and an articulator noise.

33. The method of claim 31, wherein said enhanced phone set includes
acoustic-phonetic symbols, said acoustic-phonetic symbols being utilized in
said transcription process to represent acoustic-phonetic processes of said
speech data.



34. The method of claim 33, wherein said enhanced phone set further
includes connectors used in said transcription process to connect said 
acoustic-phonetic symbols to base-phones affected by acoustic-phonetic 
processes, thereby producing composite-phones. 



35. The method of claim 34, wherein said connectors indicate how and 
where said acoustic-phonetic processes affect said base-phones. 

36. The method of claim 35, wherein said connectors include a character
">" that is placed to the left of one of said base-phones to indicate that one of
said acoustic-phonetic processes affects a beginning of said one of said base-
phones.

37. The method of claim 36, wherein said character ">" is placed to the left 
of one of said composite-phones to indicate that one of said acoustic-phonetic 
processes affects a beginning of said one of said composite-phones. 


38. The method of claim 35, wherein said connectors include a character 
that is placed to the right of one of said base-phones to indicate that one
of said acoustic-phonetic processes affects an ending of said one of said base-
phones.


39. The method of claim 38, wherein said character is placed to the 
right of one of said composite-phones to indicate that one of said acoustic- 
phonetic processes affects an ending of said one of said composite-phones. 

40. The method of claim 35, wherein said connectors include a character
that is placed to the right of one of said base-phones to indicate that one 
of said acoustic-phonetic processes affects an entirety of said one of said 
base-phones. 

41. The method of claim 40, wherein said character is placed to the
right of one of said composite-phones to indicate that one of said acoustic- 
phonetic processes affects an entirety of said one of said composite-phones. 

42. The method of claim 35, wherein said connectors include a character
" A " that is placed to the right of one of said base-phones to indicate that one
of said acoustic-phonetic processes occurred completely within said one of 
said base-phones. 

43. The method of claim 42, wherein said character " A " is placed to the
right of one of said composite-phones to indicate that one of said acoustic-
phonetic processes occurred completely within said one of said composite-
phones.

44. The method of claim 33, wherein said acoustic-phonetic content
represented by said acoustic-phonetic symbols includes one of a nasalization,
a glottalization variance, a breathiness, a labialization, a palatalization, a
voicing, a devoicing, a voiced frication, a low frequency voiceless frication, a
high frequency voiceless frication, an epenthetic vowel, a murmur, an air
puff, a burst quality, an approximation, an absence of a burst/release, and a
tongue click.

45. The method of claim 29, wherein said transformation rules include
merge-type transformation rules that combine two adjacent phones in said 
phone dataset into a single phone selected from said enhanced phone set. 



46. The method of claim 29, wherein said transformation rules include
split-type transformation rules that separate one phone in said phone dataset
into two different phones selected from said enhanced phone set.

47. The method of claim 29, wherein said transformation rules include
replace-type transformation rules that replace one phone in said phone
dataset with a different phone selected from said enhanced phone set.



48. The method of claim 29, wherein said transformation rules include 
change in context-type transformation rules that change one phone in said 
phone dataset to a different phone selected from said enhanced phone set 
depending on context.

49. A system for speech processing, comprising:
means for generating speech data;
means for providing an enhanced phone set; and
means for producing a transcription using a transcription process that
selects appropriate phones from said enhanced phone set to
represent said speech data.

50. A computer-readable medium comprising program instructions for
speech processing, by performing the steps of:
generating speech data from one or more speech sources;
providing an enhanced phone set; and
producing a transcription using a transcription process that selects
appropriate phones from said enhanced phone set to represent
said speech data.








